Hadoop YARN / YARN-4653

Document YARN security model from the perspective of Application Developers

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.2
    • Fix Version/s: 2.8.0, 2.7.3, 3.0.0-alpha1
    • Component/s: site
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      What YARN apps do for security today is generally copied directly from the distributed shell, with a bit of ill-informed superstition as the only accompanying prose.

      We need a normative document in the YARN site covering

      1. the needs for YARN security
      2. token creation for AM launch
      3. how the RM gets involved
      4. token propagation on container launch
      5. token renewal strategies
      6. how to get tokens for other apps like HBase and Hive
      7. how to work under Oozie

      Perhaps the WritingYarnApplications.md doc could be updated; otherwise, why not just link to the relevant bit of the distributed shell client on GitHub, as a guarantee of staying up to date?

      1. YARN-4653-001.patch (20 kB, Steve Loughran)
      2. YARN-4653-002.patch (24 kB, Steve Loughran)
      3. YARN-4653-003.patch (24 kB, Steve Loughran)
      4. YARN-4653-004.patch (25 kB, Steve Loughran)

          Activity

          vinodkv Vinod Kumar Vavilapalli added a comment -

          Closing the JIRA as part of 2.7.3 release.

          jianhe Jian He added a comment -

          Committed to trunk, branch-2, branch-2.8, branch-2.7

          Thanks, Steve Loughran!

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #9302 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9302/)
          YARN-4653. Document YARN security model from the perspective of (jianhe: rev dea90c9a86d0b17f36d0bdf24ca0c789dd1de2b6)

          • hadoop-project/src/site/site.xml
          • hadoop-yarn-project/CHANGES.txt
          • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md
          hadoopqa Hadoop QA added a comment -
          -1 overall

          || Vote || Subsystem || Runtime || Comment ||
          | 0 | reexec | 0m 17s | Docker mode activated. |
          | +1 | @author | 0m 0s | The patch does not contain any @author tags. |
          | 0 | mvndep | 1m 8s | Maven dependency ordering for branch |
          | +1 | mvnsite | 0m 27s | trunk passed |
          | 0 | mvndep | 0m 17s | Maven dependency ordering for patch |
          | +1 | mvnsite | 0m 24s | the patch passed |
          | -1 | whitespace | 0m 0s | The patch has 15 line(s) that end in whitespace. Use git apply --whitespace=fix. |
          | +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
          | +1 | asflicense | 0m 24s | Patch does not generate ASF License warnings. |
          | | | 3m 11s | |

          || Subsystem || Report/Notes ||
          | Docker | Image:yetus/hadoop:0ca8df7 |
          | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12787262/YARN-4653-004.patch |
          | JIRA Issue | YARN-4653 |
          | Optional Tests | asflicense mvnsite xml |
          | uname | Linux d24823a12d87 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
          | Build tool | maven |
          | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
          | git revision | trunk / e9a6226 |
          | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/10545/artifact/patchprocess/whitespace-eol.txt |
          | modules | C: hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: . |
          | Max memory used | 52MB |
          | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/10545/console |
          | Powered by | Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org |

          This message was automatically generated.

          stevel@apache.org Steve Loughran added a comment -

          Patch 004; write down my understanding of AM token renewal (based on feedback in this JIRA), some final review of the text

          stevel@apache.org Steve Loughran added a comment -

          ok, to confirm then

          1. The token handed off by the RM to the NM for localization is refreshed/updated as needed.
          2. No tokens in the app launch context are refreshed. That is, if it holds an out-of-date HDFS token, that token is not renewed.
          3. Therefore, to survive an AM restart after token failure, your AM has to get the NMs to localize the keytab, or make no HDFS accesses until (somehow) a new token has been passed to it from a client.

          This is what I will say
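
          The three points above, together with the AMRM-token remark made elsewhere in this thread, can be sketched as a small decision table. This is only a toy model in plain Java, not Hadoop code: the class name, the enum constants and both helper methods are hypothetical names chosen for illustration, and the rules simply restate what this discussion says.

```java
import java.util.EnumSet;
import java.util.Set;

/** Toy summary of which tokens the RM keeps fresh, per this thread. Not Hadoop code. */
final class TokenRefreshSketch {

    /** Hypothetical labels for the token categories discussed in the thread. */
    enum TokenKind {
        AMRM,                    // AM-to-RM token
        NM_LOCALIZATION_HDFS,    // HDFS token the RM hands to the NM for localization
        LAUNCH_CONTEXT_HDFS,     // HDFS token the user put in the launch context
        LAUNCH_CONTEXT_OTHER     // user-supplied tokens for ATS, Hive, ...
    }

    /** Per the thread: only these two are refreshed on the application's behalf. */
    static final Set<TokenKind> REFRESHED_BY_RM =
        EnumSet.of(TokenKind.AMRM, TokenKind.NM_LOCALIZATION_HDFS);

    /** True when the application itself has to obtain a replacement token. */
    static boolean applicationMustReplace(TokenKind kind) {
        return !REFRESHED_BY_RM.contains(kind);
    }

    /**
     * Point 3: a restarted AM whose launch-context HDFS token has expired can
     * only reach HDFS with a localized keytab or a new token from a client.
     */
    static boolean restartedAmCanReachHdfs(boolean launchTokenExpired,
                                           boolean keytabLocalized,
                                           boolean newTokenFromClient) {
        return !launchTokenExpired || keytabLocalized || newTokenFromClient;
    }

    public static void main(String[] args) {
        for (TokenKind kind : TokenKind.values()) {
            System.out.println(kind + " -> application must replace: "
                + applicationMustReplace(kind));
        }
    }
}
```

          The model deliberately ignores how a keytab is localized or how a client would push a new token; it only records who is responsible for which token.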

          jianhe Jian He added a comment -

          "what about the tokens supplied to the container launch context for the container to start at all?"

          Sorry, I'm not sure I understand what you mean. In the case of MR, any tokens in the containerLaunchContext (supplied by the user) remain the same. Those tokens are not refreshed and will eventually expire. The HDFS token used for localization is indeed refreshed: the RM requests a new token on the user's behalf and distributes it to the NM's localization service. Tokens for any other services (ATS, Hive) supplied by the user are not refreshed.

          The patch looks good. Only my earlier comment remains:
          I tried to compile the HTML file and found that the heading below has a formatting problem. Only the first line is recognized as the title.

          ### AM keytab distributed via YARN; AM regenerates delegation
          tokens for containers.
          
          stevel@apache.org Steve Loughran added a comment -

          I know the apps need to sort out their own tokens; I've tried to explain that in the long lived services bit.

          I'm wondering about: what about the tokens supplied to the container launch context for the container to start at all?

          jianhe Jian He added a comment -

          The title below has a formatting issue; both parts need to be on the same line.

          ### AM keytab distributed via YARN; AM regenerates delegation
          tokens for containers.

          "No? I'm thinking of all tokens supplied to the container launch context,"

          I think not. The delegation tokens are kept renewed by the DelegationTokenRenewer thread every 24 hours. The AM keeps using the same token until it expires after 7 days.

          "What should an app do in terms of running anything in its own process to refresh/renew tokens?"

          IIUC, renewal is done automatically by the DelegationTokenRenewer thread in the RM every 24 hours until the final expiration (7 days). After that, the AM has to get a new token in some way to run beyond 7 days, or just use keytabs instead of delegation tokens, as you mentioned.
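
          The difference between renewing a token and having to fetch a new one can be illustrated with a little date arithmetic. The sketch below is a toy model in plain Java, not Hadoop code: the class and method names are hypothetical, and the 24-hour and 7-day constants are just the figures quoted in this comment (real clusters may configure them differently).

```java
import java.time.Duration;
import java.time.Instant;

/** Toy model of a delegation token's renewal window. Not Hadoop code. */
final class TokenLifetimeSketch {

    // Figures quoted in the comment above; assumed here, configurable in practice.
    static final Duration RENEW_INTERVAL = Duration.ofHours(24);
    static final Duration MAX_LIFETIME = Duration.ofDays(7);

    /** Expiry after a renewal at 'now': one interval ahead, capped at the hard limit. */
    static Instant expiryAfterRenewal(Instant issued, Instant now) {
        Instant hardLimit = issued.plus(MAX_LIFETIME);
        Instant next = now.plus(RENEW_INTERVAL);
        return next.isAfter(hardLimit) ? hardLimit : next;
    }

    /** True once renewal can no longer help and a new token must be fetched. */
    static boolean needsNewToken(Instant issued, Instant now) {
        return !now.isBefore(issued.plus(MAX_LIFETIME));
    }

    public static void main(String[] args) {
        Instant issued = Instant.parse("2016-02-01T00:00:00Z");
        Instant lateRenewal = issued.plus(Duration.ofDays(6)).plus(Duration.ofHours(12));
        // A renewal late on day 6 cannot push the expiry past the 7-day limit.
        System.out.println("expiry after late renewal: "
            + expiryAfterRenewal(issued, lateRenewal));
        System.out.println("needs new token on day 8: "
            + needsNewToken(issued, issued.plus(Duration.ofDays(8))));
    }
}
```

          In this model, everything before the hard limit is "renewal" (handled by the RM according to the comment above), while anything after it requires the application to obtain a fresh token or fall back to a keytab.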

          stevel@apache.org Steve Loughran added a comment -

          Patch 003. Try to clarify renewal vs regeneration, cut the proposed "Regen through reboot" strategy.

          Regarding renewal, I'm now confused about what the application needs to do here.

          What should an app do in terms of running anything in its own process to refresh/renew tokens?

          stevel@apache.org Steve Loughran added a comment -

          "Wonder how this works. Since container does not have keytab, so no kerberos channel. What kind of authentication is this to get the delegation tokens"

          Spark uses HTTPS here; the AM has a keytab. I'll clarify that.

          "RM will not refresh any delegation tokens on AM restart. It'll refresh AMRM token for sure."

          No? I'm thinking of all tokens supplied to the container launch context: the ones needed for localization by the NN, and those for other services the app needs (e.g. ATS, Hive, ...). Doesn't the RM do those?

          jianhe Jian He added a comment -

          Thanks, Steve! Great material!
          Some questions/comments I have:

          "It is the responsibility of the application to renew all tokens other than the AMRM and timeline tokens."

          I personally feel the word 'renew' is a bit confusing here. We have two kinds of 'renew': 1) Before a token's final expiration, tokens submitted via the ApplicationSubmissionContext are automatically renewed by the DelegationTokenRenewer in the RM. 2) After the token's final expiration, the application has to re-fetch the token by itself.
          Should we clarify these two?

          "The AM must implement an IPC interface which permits containers to request a new set of delegation tokens; this interface must itself use authentication and ideally wire encryption."

          I wonder how this works. Since the container does not have a keytab, there is no Kerberos channel. What kind of authentication is used to get the delegation tokens?

          "Before a delegation token is due to expire, the processes running in the containers must request new tokens from the Application Master, over the IPC channel."

          It is not clear to me how this works. Say a container wants a new HDFS delegation token: how does it get that token from the AM? Is it that the AM obtains a new HDFS delegation token first and then passes it to the container when the container asks for it?

          "Because the RM refreshes tokens on AM restart,"

          Correct me if I'm wrong, but the RM will not refresh any delegation tokens on AM restart. It will refresh the AMRM token for sure.

          "A thread or executor is started to renew threads on a regular basis."

          Should it be "is started to renew 'tokens'"?

          stevel@apache.org Steve Loughran added a comment -

          Patch -002; covers token extraction, renewal and cancellation.

          I've run out of things to say here; I'd be grateful for some proofreading and then a positive vote.

          This patch is available as a pull request at https://github.com/apache/hadoop/pull/72 if people want to comment there.

          stevel@apache.org Steve Loughran added a comment -

          Rendered doc is at: https://github.com/steveloughran/hadoop/blob/HADOOP-12649-security/YARN-4653-yarn/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md

          hadoopqa Hadoop QA added a comment -
          -1 overall

          || Vote || Subsystem || Runtime || Comment ||
          | 0 | reexec | 0m 0s | Docker mode activated. |
          | +1 | @author | 0m 0s | The patch does not contain any @author tags. |
          | 0 | mvndep | 0m 14s | Maven dependency ordering for branch |
          | +1 | mvnsite | 0m 24s | trunk passed |
          | 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
          | +1 | mvnsite | 0m 24s | the patch passed |
          | -1 | whitespace | 0m 0s | The patch has 17 line(s) that end in whitespace. Use git apply --whitespace=fix. |
          | +1 | xml | 0m 0s | The patch has no ill-formed XML file. |
          | +1 | asflicense | 0m 18s | Patch does not generate ASF License warnings. |
          | | | 1m 52s | |

          || Subsystem || Report/Notes ||
          | Docker | Image:yetus/hadoop:0ca8df7 |
          | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12785533/YARN-4653-001.patch |
          | JIRA Issue | YARN-4653 |
          | Optional Tests | asflicense mvnsite xml |
          | uname | Linux ac13bc823499 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
          | Build tool | maven |
          | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
          | git revision | trunk / 8f2622b |
          | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/10458/artifact/patchprocess/whitespace-eol.txt |
          | modules | C: hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: . |
          | Max memory used | 29MB |
          | Powered by | Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org |
          | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/10458/console |

          This message was automatically generated.

          stevel@apache.org Steve Loughran added a comment -

          Patch -001, document with checklist.

          stevel@apache.org Steve Loughran added a comment -

          thanks for the link ... hadn't seen that. nice. That's a document which should be linked to, ideally even pulled into the hadoop site

          I'm doing something less ambitious but equally important: explain to application developers what they need. I'll change the title accordingly

          drankye Kai Zheng added a comment -

          Great to see this, thanks Steve Loughran. If I understand correctly, this would be a nice update or upgrade to the documentation in HADOOP-9621. Linked it in case it helps.


            People

            • Assignee: stevel@apache.org Steve Loughran
            • Reporter: stevel@apache.org Steve Loughran
            • Votes: 0
            • Watchers: 16

            Time Tracking

            • Original Estimate: 2h
            • Remaining Estimate: 2h
            • Time Spent: Not Specified