Uploaded image for project: 'Hadoop YARN'
  1. Hadoop YARN
  2. YARN-3055

The token is not renewed properly if it's shared by jobs (oozie) in DelegationTokenRenewer

    Details

    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      After YARN-2964, there is only one timer to renew the token if it's shared by jobs.
      In removeApplicationFromRenewal, when going to remove a token, and the token is shared by other jobs, we will not cancel the token.
      Meanwhile, we should not cancel the timerTask, also we should not remove it from allTokens. Otherwise for the existing submitted applications which share this token will not get renew any more, and for new submitted applications which share this token, the token will be renew immediately.

      For example, we have 3 applications: app1, app2, app3. And they share the token1. See following scenario:
      1). app1 is submitted firstly, then app2, and then app3. In this case, there is only one token renewal timer for token1, and is scheduled when app1 is submitted
      2). app1 is finished, then the renewal timer is cancelled. token1 will not be renewed any more, but app2 and app3 still use it, so there is problem.

      1. YARN-3055.patch
        18 kB
        Daryn Sharp
      2. YARN-3055.patch
        17 kB
        Daryn Sharp
      3. YARN-3055.002.patch
        3 kB
        Yi Liu
      4. YARN-3055.001.patch
        2 kB
        Yi Liu

        Issue Links

          Activity

          Hide
          hitliuyi Yi Liu added a comment -

          Jian He, Karthik Kambatla and Jason Lowe, can you help to take a look?

          Show
          hitliuyi Yi Liu added a comment - Jian He , Karthik Kambatla and Jason Lowe , can you help to take a look?
          Hide
          hitliuyi Yi Liu added a comment -

          The token is still not be renewed, will update the patch later

          Show
          hitliuyi Yi Liu added a comment - The token is still not be renewed, will update the patch later
          Hide
          hitliuyi Yi Liu added a comment -

          Upload a new patch.

          Show
          hitliuyi Yi Liu added a comment - Upload a new patch.
          Hide
          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12691941/YARN-3055.002.patch
          against trunk revision 08ac062.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

          org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6322//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6322//console

          This message is automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12691941/YARN-3055.002.patch against trunk revision 08ac062. +1 @author . The patch does not contain any @author tags. -1 tests included . The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 core tests . The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6322//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6322//console This message is automatically generated.
          Hide
          jianhe Jian He added a comment -

          Meanwhile, we should not cancel the timerTask, also we should not remove it from allTokens.

          IIUC, this is not the case. Because If launcher job first gets added to the appTokens map, DelegationTokenRenewer will not add DelegationTokenToRenew instance for the sub-job. So the tokens in removeApplicationFromRenewal will return empty for the sub-job when the sub-job completes. So the token won’t be removed from the allTokens.

          Show
          jianhe Jian He added a comment - Meanwhile, we should not cancel the timerTask, also we should not remove it from allTokens. IIUC, this is not the case. Because If launcher job first gets added to the appTokens map, DelegationTokenRenewer will not add DelegationTokenToRenew instance for the sub-job. So the tokens in removeApplicationFromRenewal will return empty for the sub-job when the sub-job completes. So the token won’t be removed from the allTokens.
          Hide
          hitliuyi Yi Liu added a comment -

          Is it possible the launcher job finishes firstly, but sub-jobs are still running? If so, the issue exists. If not, then the issue is invalid.

          Show
          hitliuyi Yi Liu added a comment - Is it possible the launcher job finishes firstly, but sub-jobs are still running? If so, the issue exists. If not, then the issue is invalid.
          Hide
          jianhe Jian He added a comment -

          Is it possible the launcher job finishes firstly, but sub-jobs are still running?

          This is an existing issue as discussed in https://issues.apache.org/jira/browse/YARN-2964?focusedCommentId=14252218&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14252218. And a long-term solution is to have a group Id for a group of applications so that the token lifetime is tied to a group of applications instead of a single application.

          Show
          jianhe Jian He added a comment - Is it possible the launcher job finishes firstly, but sub-jobs are still running? This is an existing issue as discussed in https://issues.apache.org/jira/browse/YARN-2964?focusedCommentId=14252218&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14252218 . And a long-term solution is to have a group Id for a group of applications so that the token lifetime is tied to a group of applications instead of a single application.
          Hide
          jlowe Jason Lowe added a comment -

          Agree with Jian. I believe the concern being raised here is not specific to the changes in YARN-2704 and YARN-2964 but a long-standing issue with YARN's handling of delegation tokens shared between jobs.

          Show
          jlowe Jason Lowe added a comment - Agree with Jian. I believe the concern being raised here is not specific to the changes in YARN-2704 and YARN-2964 but a long-standing issue with YARN's handling of delegation tokens shared between jobs.
          Hide
          vinodkv Vinod Kumar Vavilapalli added a comment -

          Bumping this to be a blocker given YARN-3439 is marked so.

          Show
          vinodkv Vinod Kumar Vavilapalli added a comment - Bumping this to be a blocker given YARN-3439 is marked so.
          Hide
          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12691941/YARN-3055.002.patch
          against trunk revision 4b3948e.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

          org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7220//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7220//console

          This message is automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12691941/YARN-3055.002.patch against trunk revision 4b3948e. +1 @author . The patch does not contain any @author tags. -1 tests included . The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. -1 core tests . The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7220//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7220//console This message is automatically generated.
          Hide
          daryn Daryn Sharp added a comment -

          Correctly handling the "don't cancel" setting for jobs submitting job has been a recurring issue. We're internally testing a small patch to continue renewing until all jobs using the token(s) have finished. Handling the auto-fetch of proxy tokens proved a bit more difficult so I need to complete the internal patch. I can take this over or post a partial patch if Yi Liu would like to finish it.

          Show
          daryn Daryn Sharp added a comment - Correctly handling the "don't cancel" setting for jobs submitting job has been a recurring issue. We're internally testing a small patch to continue renewing until all jobs using the token(s) have finished. Handling the auto-fetch of proxy tokens proved a bit more difficult so I need to complete the internal patch. I can take this over or post a partial patch if Yi Liu would like to finish it.
          Hide
          vinodkv Vinod Kumar Vavilapalli added a comment -

          Daryn Sharp/Jian He, I briefly looked at the existing patch on this JIRA and it seems like it will work. Can you also take a look?

          Yi Liu, can you see if you can add a test for this in TestDelegationTokenRenewer.java?

          This is the last blocker on 2.7.0 as of today. Appreciate all the help I can get, thanks all.

          Show
          vinodkv Vinod Kumar Vavilapalli added a comment - Daryn Sharp / Jian He , I briefly looked at the existing patch on this JIRA and it seems like it will work. Can you also take a look? Yi Liu , can you see if you can add a test for this in TestDelegationTokenRenewer.java? This is the last blocker on 2.7.0 as of today. Appreciate all the help I can get, thanks all.
          Hide
          daryn Daryn Sharp added a comment -

          On cursory glance, are you sure this isn't going to leak tokens? Ie. does it remove tokens from data structures in all cases or can a token get left in allTokens?

          Show
          daryn Daryn Sharp added a comment - On cursory glance, are you sure this isn't going to leak tokens? Ie. does it remove tokens from data structures in all cases or can a token get left in allTokens?
          Hide
          daryn Daryn Sharp added a comment -

          This appears to go back to the really old days of renewing the token for its entire lifetime. Most unfortunate.

          The renewer looks like it may turn into a DOS weapon. Renewing a token returns the next expiration. The renewer uses a timer to renew 90% before expiration. After the last renewal, the same expiration ("the wall") will be returned as before. 90% of "the wall" eventually becomes a rapid fire renewal. There's an army of 50 threads prepared to fire concurrently.

          My other concern is that it used to be the first job submitted with a given token that determined if the token is to be cancelled. Now any job can influence the cancelling. This patch didn't specifically break that behavior, but the original YARN-2704 did, which precipitated YARN-2964 to break it differently, and now this jira.

          The ramification is we used to tell users to make sure the first job set the conf correctly, and essentially don't worry after that. Now they do have to worry. Any sub-job with the default of canceling tokens will kill the overall workflow. Sub-jobs should not have jurisdiction over the tokens.

          Show
          daryn Daryn Sharp added a comment - This appears to go back to the really old days of renewing the token for its entire lifetime. Most unfortunate. The renewer looks like it may turn into a DOS weapon. Renewing a token returns the next expiration. The renewer uses a timer to renew 90% before expiration. After the last renewal, the same expiration ("the wall") will be returned as before. 90% of "the wall" eventually becomes a rapid fire renewal. There's an army of 50 threads prepared to fire concurrently. My other concern is that it used to be the first job submitted with a given token that determined if the token is to be cancelled. Now any job can influence the cancelling. This patch didn't specifically break that behavior, but the original YARN-2704 did, which precipitated YARN-2964 to break it differently, and now this jira. The ramification is we used to tell users to make sure the first job set the conf correctly, and essentially don't worry after that. Now they do have to worry. Any sub-job with the default of canceling tokens will kill the overall workflow. Sub-jobs should not have jurisdiction over the tokens.
          Hide
          jianhe Jian He added a comment -

          does it remove tokens from data structures in all cases or can a token get left in allTokens?

          I think it does not have a leak. removeFailedDelegationToken will remove the token if renew fails. If there's a leak, the leak exists before YARN-2704.

          The renewer looks like it may turn into a DOS weapon.

          It does seem odd to get the expiration date by renewing the token. But there's just currently no way to get the expiration date other than the renew method.

          Any sub-job with the default of canceling tokens will kill the overall workflow.

          Am I missing something ? I think currently the sub-job won't kill the overall workflow. the sub-job flag will be ignored, if the first job sets the flag.

          Overall, I think overall the current patch will work, other than few comments I have.
          Daryn Sharp, you mentioned you have another patch. could you share the patch or you think the current patch is fine ?

          Show
          jianhe Jian He added a comment - does it remove tokens from data structures in all cases or can a token get left in allTokens? I think it does not have a leak. removeFailedDelegationToken will remove the token if renew fails. If there's a leak, the leak exists before YARN-2704 . The renewer looks like it may turn into a DOS weapon. It does seem odd to get the expiration date by renewing the token. But there's just currently no way to get the expiration date other than the renew method. Any sub-job with the default of canceling tokens will kill the overall workflow. Am I missing something ? I think currently the sub-job won't kill the overall workflow. the sub-job flag will be ignored, if the first job sets the flag. Overall, I think overall the current patch will work, other than few comments I have. Daryn Sharp , you mentioned you have another patch. could you share the patch or you think the current patch is fine ?
          Hide
          daryn Daryn Sharp added a comment -

          It does seem odd to get the expiration date by renewing the token

          The expiration is metadata associated with the token that is only known to the token issuer's secret manager. The correct fix is for the renewer to not reschedule if the next expiration is the same as the last. The bug wasn't a real priority when tokens weren't renewed forever. If we regress to renewing forever, then it does become a problem.

          I think currently the sub-job won't kill the overall workflow.

          Correct, I misread in my haste. It's rather the opposite: sub-jobs can override the original job's request to cancel the tokens.

          I think overall the current patch will work, other than few comments I have.

          It "works" but not in a desirable way. Jason posted my patch we use internally on YARN-3439 which is duped to this jira. I'm updating it to handle the proxy refresh cases and will post shortly. The current semantics of the conf setting and the 2.x changes have been nothing but production blockers. Ref counting will solve this once and for all.

          Show
          daryn Daryn Sharp added a comment - It does seem odd to get the expiration date by renewing the token The expiration is metadata associated with the token that is only known to the token issuer's secret manager. The correct fix is for the renewer to not reschedule if the next expiration is the same as the last. The bug wasn't a real priority when tokens weren't renewed forever. If we regress to renewing forever, then it does become a problem. I think currently the sub-job won't kill the overall workflow. Correct, I misread in my haste. It's rather the opposite: sub-jobs can override the original job's request to cancel the tokens. I think overall the current patch will work, other than few comments I have. It "works" but not in a desirable way. Jason posted my patch we use internally on YARN-3439 which is duped to this jira. I'm updating it to handle the proxy refresh cases and will post shortly. The current semantics of the conf setting and the 2.x changes have been nothing but production blockers. Ref counting will solve this once and for all.
          Hide
          jianhe Jian He added a comment -

          sure, looking forward to your patch.

          The correct fix is for the renewer to not reschedule if the next expiration is the same as the last.

          Sorry, didn't get what you mean. mind clarifying more ? The renew call after getting the new token is solely to retrieve the expiration date for the token. I found given that RM renews all tokens at once for each app on app submission, if renew rescheduling becomes a DOS problem, then app submission situation may be much worse.

          Show
          jianhe Jian He added a comment - sure, looking forward to your patch. The correct fix is for the renewer to not reschedule if the next expiration is the same as the last. Sorry, didn't get what you mean. mind clarifying more ? The renew call after getting the new token is solely to retrieve the expiration date for the token. I found given that RM renews all tokens at once for each app on app submission, if renew rescheduling becomes a DOS problem, then app submission situation may be much worse.
          Hide
          daryn Daryn Sharp added a comment -

          Haven't had a chance to run findbugs. Might grumble about sync dttr.applicationIds. Will check this afternoon.

          Show
          daryn Daryn Sharp added a comment - Haven't had a chance to run findbugs. Might grumble about sync dttr.applicationIds. Will check this afternoon.
          Hide
          daryn Daryn Sharp added a comment -

          The renew at job submission isn't the problem. It's actually very desirable. Years back, a job submitted with bad tokens - that was destined to fail - would be launched anyway. The tasks failed to connect, ipc level retries occurred, then higher level retries occurred, and yarn generally caught all exceptions and retried. Tasks were retried, perhaps the app attempt was retried, etc. In the end, a job that clearly was going to fail might tie up cluster resources for 20+ minutes. Why was it launched when a failed renew could have prevented the problem? Not to mention the renewer was hardcoded to assume the expiration interval was 24h... So much for being able to stress test the renewer with <1m expirations.

          The potential DOS problem is when a token has reached end of life expiration. Let's say the token can be renewed twice. The third and subsequent renews return the same expiration.

          1. t1 = submit + renew
          2. t2 = t1 + renew
          3. t3 = t2
          4. t4 = t2

          The renew timers fire 90% of the delta between now and the next expiration. So as end of life expiration approaches, the timer fires with an increasing frequency. 50 threads doing that virtually non-stop would not be pretty. The solution is stop renewing when the next expiration equals the last expiration. That can be addressed in another jira that's not a blocker because if tokens aren't renewed forever then it's a rare situation.

          Show
          daryn Daryn Sharp added a comment - The renew at job submission isn't the problem. It's actually very desirable. Years back, a job submitted with bad tokens - that was destined to fail - would be launched anyway. The tasks failed to connect, ipc level retries occurred, then higher level retries occurred, and yarn generally caught all exceptions and retried. Tasks were retried, perhaps the app attempt was retried, etc. In the end, a job that clearly was going to fail might tie up cluster resources for 20+ minutes. Why was it launched when a failed renew could have prevented the problem? Not to mention the renewer was hardcoded to assume the expiration interval was 24h... So much for being able to stress test the renewer with <1m expirations. The potential DOS problem is when a token has reached end of life expiration. Let's say the token can be renewed twice. The third and subsequent renews return the same expiration. t1 = submit + renew t2 = t1 + renew t3 = t2 t4 = t2 The renew timers fire 90% of the delta between now and the next expiration. So as end of life expiration approaches, the timer fires with an increasing frequency. 50 threads doing that virtually non-stop would not be pretty. The solution is stop renewing when the next expiration equals the last expiration. That can be addressed in another jira that's not a blocker because if tokens aren't renewed forever then it's a rare situation.
          Hide
          jianhe Jian He added a comment -

          The third and subsequent renews return the same expiration.

          Sorry, I'm still not fully understanding your point. Why would the new token return the same expiration date ? RM is obtaining a brand-new token and the new token has a different expiration date. In addition, the old expiring token is removed and won't be renewed if a new token is acquired. The token renew frequency is not changing.

          Show
          jianhe Jian He added a comment - The third and subsequent renews return the same expiration. Sorry, I'm still not fully understanding your point. Why would the new token return the same expiration date ? RM is obtaining a brand-new token and the new token has a different expiration date. In addition, the old expiring token is removed and won't be renewed if a new token is acquired. The token renew frequency is not changing.
          Hide
          daryn Daryn Sharp added a comment -

          I believe you are describing the behavior of 2.6's new proxy token refresh feature. I won't digress into how broken it appears to be except in the simplest use case. With it off, there is no fetching a new token.

          Show
          daryn Daryn Sharp added a comment - I believe you are describing the behavior of 2.6's new proxy token refresh feature. I won't digress into how broken it appears to be except in the simplest use case. With it off, there is no fetching a new token.
          Hide
          jianhe Jian He added a comment -

          Okay, now I understand what you are saying. I think we are thinking completely two different things. I thought you were saying how the token refresh feature results in this token renewal frequency changing. It's actually an existing problem. Yes, the current token renewal will cause the token renewal frequency to increase towards the end of the token life and I think this has been like that long time ago. thanks for the clarification.

          Show
          jianhe Jian He added a comment - Okay, now I understand what you are saying. I think we are thinking completely two different things. I thought you were saying how the token refresh feature results in this token renewal frequency changing. It's actually an existing problem. Yes, the current token renewal will cause the token renewal frequency to increase towards the end of the token life and I think this has been like that long time ago. thanks for the clarification.
          Hide
          vinodkv Vinod Kumar Vavilapalli added a comment -

          This patch works too. Some comments:

          handleAppSubmitEvent(): Two apps calling at the same time can renew the same token twice. You should move down the renewal to where it was before.

          I think requestNewHdfsDelegationTokenIfNeeded() previously itself had a bug which tried to cancel+rerenew all HDFS tokens of an app even though only one of its token is close to expiring - perhaps to take care of federated HDFS situation. Doesn't affect this patch though, but can we rename the requestNewHdfsDelegationToken's argument applicationIds to be referringAppIDs for clarity? I'd rename DelegationTokenToRenew.applicationIds to be referringAppIDs too, it was confusing to me.

          You are also ignoring some existing tests, may be just for testing. Let's revert the ignores.

          Show
          vinodkv Vinod Kumar Vavilapalli added a comment - This patch works too. Some comments: handleAppSubmitEvent(): Two apps calling at the same time can renew the same token twice. You should move down the renewal to where it was before. I think requestNewHdfsDelegationTokenIfNeeded() previously itself had a bug which tried to cancel+rerenew all HDFS tokens of an app even though only one of its token is close to expiring - perhaps to take care of federated HDFS situation. Doesn't affect this patch though, but can we rename the requestNewHdfsDelegationToken's argument applicationIds to be referringAppIDs for clarity? I'd rename DelegationTokenToRenew.applicationIds to be referringAppIDs too, it was confusing to me. You are also ignoring some existing tests, may be just for testing. Let's revert the ignores.
          Hide
          daryn Daryn Sharp added a comment -

          Thanks Vinod, I'll revise this morning. The ignores shouldn't be there. I did that for our internal emergency fix because we I didn't handle proxy refresh tokens so I didn't care the tests failed.

          Show
          daryn Daryn Sharp added a comment - Thanks Vinod, I'll revise this morning. The ignores shouldn't be there. I did that for our internal emergency fix because we I didn't handle proxy refresh tokens so I didn't care the tests failed.
          Hide
          daryn Daryn Sharp added a comment -

          Two apps could double renew tokens (completely benign) before this patch. In practice the possibility is slim and its harmless.

          However, currently it's quite buggy. Both apps renewed and then stomped over each other's dttrs in allTokens. Now both apps reference separate yet equivalent dttr instances, when the intention was only one app should reference a token. A second/duplicate timer task was also scheduled. Haven't bothered to check later fallout from the inconsistencies.

          Patch: A double renew can still occur (unavoidable) but only one timer is scheduled. All apps reference the same dttr instance. Moving the logic down only creates 3 loops instead of 2 loops but I'll do if you feel strongly.

          Show
          daryn Daryn Sharp added a comment - Two apps could double renew tokens (completely benign) before this patch. In practice the possibility is slim and its harmless. However, currently it's quite buggy. Both apps renewed and then stomped over each other's dttrs in allTokens. Now both apps reference separate yet equivalent dttr instances, when the intention was only one app should reference a token. A second/duplicate timer task was also scheduled. Haven't bothered to check later fallout from the inconsistencies. Patch: A double renew can still occur (unavoidable) but only one timer is scheduled. All apps reference the same dttr instance. Moving the logic down only creates 3 loops instead of 2 loops but I'll do if you feel strongly.
          Hide
          hadoopqa Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12724256/YARN-3055.patch
          against trunk revision 6495940.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

          Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7276//testReport/
          Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7276//console

          This message is automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12724256/YARN-3055.patch against trunk revision 6495940. +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7276//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7276//console This message is automatically generated.
          Hide
          vinodkv Vinod Kumar Vavilapalli added a comment -

          You are right that the previous code also had the same issue. I am good with the patch.

          Will check this in unless jenkins or Jian He say no.

          Show
          vinodkv Vinod Kumar Vavilapalli added a comment - You are right that the previous code also had the same issue. I am good with the patch. Will check this in unless jenkins or Jian He say no.
          Hide
          jianhe Jian He added a comment -

          thanks Daryn, patch looks good to me too. +1

          Not related to this patch, it's a bug in YARN-2704 .We should remove the token from the allTokens, otherwise, it's a leak in allTokens. it can be fixed separately.

                      if (t.token.getKind().equals(new Text("HDFS_DELEGATION_TOKEN"))) {
                        iter.remove();
                        t.cancelTimer();
                        LOG.info("Removed expiring token " + t);
                      }
          
          Show
          jianhe Jian He added a comment - thanks Daryn, patch looks good to me too. +1 Not related to this patch, it's a bug in YARN-2704 .We should remove the token from the allTokens, otherwise, it's a leak in allTokens. it can be fixed separately. if (t.token.getKind().equals( new Text( "HDFS_DELEGATION_TOKEN" ))) { iter.remove(); t.cancelTimer(); LOG.info( "Removed expiring token " + t); }
          Hide
          vinodkv Vinod Kumar Vavilapalli added a comment -

          Not related to this patch, it's a bug in YARN-2704 .We should remove the token from the allTokens, otherwise, it's a leak in allTokens. it can be fixed separately.

          Good catch. Agree that this is not related to this patch, can you please file a ticket?

          Checking this in now.

          Show
          vinodkv Vinod Kumar Vavilapalli added a comment - Not related to this patch, it's a bug in YARN-2704 .We should remove the token from the allTokens, otherwise, it's a leak in allTokens. it can be fixed separately. Good catch. Agree that this is not related to this patch, can you please file a ticket? Checking this in now.
          Hide
          vinodkv Vinod Kumar Vavilapalli added a comment -

          Committed this to trunk, branch-2 and branch-2.7. Thanks Daryn! Tx for the help Yi and Jian!

          Show
          vinodkv Vinod Kumar Vavilapalli added a comment - Committed this to trunk, branch-2 and branch-2.7. Thanks Daryn! Tx for the help Yi and Jian!
          Hide
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-trunk-Commit #7552 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7552/)
          YARN-3055. Fixed ResourceManager's DelegationTokenRenewer to not stop token renewal of applications part of a bigger workflow. Contributed by Daryn Sharp. (vinodkv: rev 9c5911294e0ba71aefe4763731b0e780cde9d0ca)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
          Show
          hudson Hudson added a comment - SUCCESS: Integrated in Hadoop-trunk-Commit #7552 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7552/ ) YARN-3055 . Fixed ResourceManager's DelegationTokenRenewer to not stop token renewal of applications part of a bigger workflow. Contributed by Daryn Sharp. (vinodkv: rev 9c5911294e0ba71aefe4763731b0e780cde9d0ca) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java hadoop-yarn-project/CHANGES.txt hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #893 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/893/)
          YARN-3055. Fixed ResourceManager's DelegationTokenRenewer to not stop token renewal of applications part of a bigger workflow. Contributed by Daryn Sharp. (vinodkv: rev 9c5911294e0ba71aefe4763731b0e780cde9d0ca)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
          • hadoop-yarn-project/CHANGES.txt
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Yarn-trunk #893 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/893/ ) YARN-3055 . Fixed ResourceManager's DelegationTokenRenewer to not stop token renewal of applications part of a bigger workflow. Contributed by Daryn Sharp. (vinodkv: rev 9c5911294e0ba71aefe4763731b0e780cde9d0ca) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java hadoop-yarn-project/CHANGES.txt
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #159 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/159/)
          YARN-3055. Fixed ResourceManager's DelegationTokenRenewer to not stop token renewal of applications part of a bigger workflow. Contributed by Daryn Sharp. (vinodkv: rev 9c5911294e0ba71aefe4763731b0e780cde9d0ca)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #159 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/159/ ) YARN-3055 . Fixed ResourceManager's DelegationTokenRenewer to not stop token renewal of applications part of a bigger workflow. Contributed by Daryn Sharp. (vinodkv: rev 9c5911294e0ba71aefe4763731b0e780cde9d0ca) hadoop-yarn-project/CHANGES.txt hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2091 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2091/)
          YARN-3055. Fixed ResourceManager's DelegationTokenRenewer to not stop token renewal of applications part of a bigger workflow. Contributed by Daryn Sharp. (vinodkv: rev 9c5911294e0ba71aefe4763731b0e780cde9d0ca)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk #2091 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2091/ ) YARN-3055 . Fixed ResourceManager's DelegationTokenRenewer to not stop token renewal of applications part of a bigger workflow. Contributed by Daryn Sharp. (vinodkv: rev 9c5911294e0ba71aefe4763731b0e780cde9d0ca) hadoop-yarn-project/CHANGES.txt hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #150 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/150/)
          YARN-3055. Fixed ResourceManager's DelegationTokenRenewer to not stop token renewal of applications part of a bigger workflow. Contributed by Daryn Sharp. (vinodkv: rev 9c5911294e0ba71aefe4763731b0e780cde9d0ca)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #150 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/150/ ) YARN-3055 . Fixed ResourceManager's DelegationTokenRenewer to not stop token renewal of applications part of a bigger workflow. Contributed by Daryn Sharp. (vinodkv: rev 9c5911294e0ba71aefe4763731b0e780cde9d0ca) hadoop-yarn-project/CHANGES.txt hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #160 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/160/)
          YARN-3055. Fixed ResourceManager's DelegationTokenRenewer to not stop token renewal of applications part of a bigger workflow. Contributed by Daryn Sharp. (vinodkv: rev 9c5911294e0ba71aefe4763731b0e780cde9d0ca)

          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
          • hadoop-yarn-project/CHANGES.txt
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #160 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/160/ ) YARN-3055 . Fixed ResourceManager's DelegationTokenRenewer to not stop token renewal of applications part of a bigger workflow. Contributed by Daryn Sharp. (vinodkv: rev 9c5911294e0ba71aefe4763731b0e780cde9d0ca) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java hadoop-yarn-project/CHANGES.txt
          Hide
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2109 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2109/)
          YARN-3055. Fixed ResourceManager's DelegationTokenRenewer to not stop token renewal of applications part of a bigger workflow. Contributed by Daryn Sharp. (vinodkv: rev 9c5911294e0ba71aefe4763731b0e780cde9d0ca)

          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
          Show
          hudson Hudson added a comment - SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2109 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2109/ ) YARN-3055 . Fixed ResourceManager's DelegationTokenRenewer to not stop token renewal of applications part of a bigger workflow. Contributed by Daryn Sharp. (vinodkv: rev 9c5911294e0ba71aefe4763731b0e780cde9d0ca) hadoop-yarn-project/CHANGES.txt hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
          Hide
          vinodkv Vinod Kumar Vavilapalli added a comment -

          Pulled this into 2.6.1. Patch applied cleanly for the most part except for small merge conflict in the test-case and the usage of java7 diamond operator.

          Ran compilation and TestDelegationTokenRenewer before the push.

          Show
          vinodkv Vinod Kumar Vavilapalli added a comment - Pulled this into 2.6.1. Patch applied cleanly for the most part except for small merge conflict in the test-case and the usage of java7 diamond operator. Ran compilation and TestDelegationTokenRenewer before the push.

            People

            • Assignee:
              daryn Daryn Sharp
              Reporter:
              hitliuyi Yi Liu
            • Votes:
              0 Vote for this issue
              Watchers:
              13 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development