Hadoop Common

Create KeyProvider extension to handle delegation tokens

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 2.6.0
    • Component/s: security
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The KeyProvider API needs to return delegation tokens to enable access to the KeyProvider from processes without Kerberos credentials (i.e., YARN containers).

      This is required for HDFS encryption and KMS integration.

      1. HADOOP-10769.1.patch
        5 kB
        Arun Suresh
      2. HADOOP-10769.2.patch
        7 kB
        Arun Suresh
      3. HADOOP-10769.3.patch
        8 kB
        Arun Suresh

        Activity

        Alejandro Abdelnur created issue -
        Larry McCay added a comment -

        Hey Alejandro Abdelnur - this strikes me as an abstraction leak from a specific provider type implementation. I don't like it being required for all provider implementations to implement and have to return null or something. What other options do we have to make this flow work?

        Alejandro Abdelnur added a comment -

        Larry McCay, this is required for any distributed implementation of the KeyProvider to work.

        This is exactly what the Hadoop FileSystem API does; the base FileSystem has the following method:

          public Token<?> getDelegationToken(String renewer) throws IOException {
            return null;
          }
        

        If this is specific to a provider impl, then HDFS would have to cast to the specific provider impl to get it, which is not good.
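        To make the polymorphism argument concrete, here is a minimal sketch of the pattern being described. All class names below are illustrative stand-ins, not the real Hadoop types (the real ones are FileSystem, Token<?>, etc.):

```java
// Sketch of the base-class-default pattern: with the method on the base
// class, calling code never needs instanceof checks or casts.
abstract class ProviderBase {
    // Default: no token. Implementations with tokens override this.
    public String getDelegationToken(String renewer) {
        return null;
    }
}

class TokenlessProvider extends ProviderBase {}

class TokenBackedProvider extends ProviderBase {
    @Override
    public String getDelegationToken(String renewer) {
        // A real implementation would contact its service; fabricated here.
        return "dt/" + renewer;
    }
}

class Submitter {
    // HDFS-side code stays generic: no cast to a concrete provider type,
    // at the cost of callers having to handle a null token.
    static String collect(ProviderBase p, String renewer) {
        return p.getDelegationToken(renewer);
    }
}
```

This is the trade-off the thread is debating: the cast disappears, but every provider carries a token method whether it needs one or not.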

        Larry McCay added a comment -

        That isn't exactly accurate: it is required by a distributed implementation that requires delegation tokens, which is an anti-pattern in my mind. A better approach is to get all keys upfront and add them to the credentials object, as is done with delegation tokens. I understand that this has been discussed on the other JIRAs, but that doesn't mean that all other implementations should be required to have it. I also don't like that HDFS will always call this and have to deal with null when a different provider is configured. It is an abstraction leak.

        I don't want to debate whether delegation tokens should be used or not - I'd like to pursue other ways for you to get the token without changing the provider interface to explicitly call it out as a requirement.

        Perhaps, we could add a call for creating an execution context where we pass in some properties and get back a context with additional properties.

            public HashMap<String, Object> getKeyProviderContext(HashMap<String,Object> properties);
        

        This would be essentially a means to get a provider some callbacks into the execution environment and a way to pass provider specific context back to the execution environment.
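        A minimal sketch of what such a context call might look like, assuming a property map keyed by well-known names. Everything here (the class name, the "renewer" and "kms.delegation.token" keys) is hypothetical, intended only to illustrate the shape of the proposal:

```java
import java.util.HashMap;

// Hypothetical context-passing provider; not part of any actual Hadoop API.
class ContextAwareKeyProvider {
    // The provider inspects the properties it understands and returns
    // provider-specific context for the execution environment to carry along.
    public HashMap<String, Object> getKeyProviderContext(HashMap<String, Object> properties) {
        HashMap<String, Object> context = new HashMap<>();
        Object renewer = properties.get("renewer");
        if (renewer != null) {
            // A token-based provider might stash an opaque token here;
            // a provider with no such need simply returns an empty map.
            context.put("kms.delegation.token", "token-for-" + renewer);
        }
        return context;
    }
}
```

The point of the shape is that token-free providers pay no interface cost: they return an empty map rather than a null token.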

        This isn't completely thought through but this is the sort of direction that I would like rather than making implementation details part of the provider interface. What do you think?

        Alejandro Abdelnur added a comment -

        Larry McCay, if the KeyProvider is not accessible from services/tasks in the cluster it is pretty much useless. And DelegationTokens enable secure access to it from processes without Kerberos credentials. A local KeyProvider (such as JavaKeyStore or User) only serves special and restricted use cases where distributed access is not required. Given that, I believe getting delegation tokens belongs in the basic KeyProvider API.

        Larry McCay added a comment -

        I'm sorry for not making my point more clearly.

        Say we want a key provider for an external key system that does not use delegation tokens but some other token instead.
        Should we add a getXToken as well?

        I am just trying to abstract away things like authentication tokens required by proprietary providers and at the same time accommodate the KMS provider without imposing this method on every provider.

        So, if we were to create an execution context that we can then add to the credentials object then it could be picked up by the services/tasks at runtime. Unfortunately, we will have to know about certain names in order to put them in through the right method and get them out from the right method. Unless we added a new method for setting/getting the whole context....?

        I'm not sure what you are getting at with the "if the KeyProvider is not accessible from services/tasks in the cluster it is pretty much useless." statement. How would a more generic approach to getting required tokens make the key provider less accessible?

        Anyway, I would be more comfortable with a more generic approach to this issue. This is after all an SPI contract for accommodating arbitrary providers. If KMS has a requirement for extra context information at runtime then others likely do as well.

        Alejandro Abdelnur added a comment -

        I got it now. I think in Hadoop-land we have pretty much standardized on DelegationTokens (HDFS, YARN, WebHdfs, HiveMetaStore, HBase) and we have a generic mechanism to distribute them. Given that, I would say an external KeyProvider impl either uses KMS or wraps its authenticated token within a DelegationToken implementation.

        Larry McCay added a comment -

        Relegating a provider to being plugged into KMS which will require delegation tokens and whatever the external provider needs in the first place defeats the purpose of a generic KeyProvider API. In fact, I'm not sure how you would accommodate such a provider to begin with.

        Also, making them wrap their token would be unnecessary and potentially not enough. There may even be need for other context values for a given provider.

        Alejandro Abdelnur added a comment -

        It is not the intention, at all, to relegate external providers to be plugged into KMS to work.

        The intention is to enable the Hadoop KeyProvider API with a security pattern managed and understood by existing Hadoop services: storage, propagation, renewal of tokens is already handled throughout the platform. The DelegationToken framework already does all this. And an external provider can fully leverage this without having to be deployed via KMS.

        If you deploy an external provider via KMS you then get additional benefits out of the box: scalability, caching, isolated DEK management.

        Also, note that getDelegationToken() does not handle authentication, just getting a delegation token. Authentication is assumed to be done via UGI mechanisms.

        Regarding context values for a given provider, UGI credentials are already used in that way.

        Because of this, IMO, we are good with DelegationToken support for now. And I'm happy to consider changes when a concrete example not handled by it arises.

        Larry McCay added a comment -

        I fully understand your intent here but you seem to be missing the fact that the provider API is a client side abstraction to an arbitrary key provider or providers.

        If you deploy an external provider via KMS you then get additional benefits out of the box: scalability, caching, isolated DEK management.

        All of the benefits of the KMS are wonderful and can be easily added to simple providers by plugging them into the KMS server. However, more sophisticated key management solutions will provide these themselves and the key provider interface on the client side shouldn't impose the need for a method that is extraneous to the given provider. The need for getting a DelegationToken is a reasonable requirement for a specific provider - in this case the KMSClientKeyProvider but it isn't something that needs to be done for all implementations.

        Also, note that getDelegationToken() does not handle authentication, just getting a delegation token. Authentication is assumed to be done via UGI mechanisms.

        Perhaps I am missing something - my understanding is that you need getDelegationToken so that you can get it from the KMS to allow for "authentication" to the KMS later from services/tasks that will get the token from the credentials file for the job submission in order to request a key from the KMS. Is this incorrect?

        My proposal is to allow for this very capability through a more generic contract with the key providers.

        Aaron T. Myers added a comment -

        Larry/Tucu - I sort of half agree with both of you. I agree with Tucu that the suggestion of introducing a new "getKeyProviderContext" API is fairly unprecedented and seems fragile. I also agree with Larry, though, that it's not unreasonable for the KeyProvider API to not know anything about DelegationTokens - seems like separate concerns.

        My suggestion is to use the KeyProviderExtension mechanism being introduced by HADOOP-10719 to add DT support to only the KMSClientKeyProvider. Thoughts?

        Alejandro Abdelnur added a comment -

        Aaron T. Myers, 'No soup for you!'

        It seems like a reasonable approach.

        Larry McCay added a comment -

        Hi Aaron T. Myers - That intent seems more reasonable to me, but I can't say that I completely understand the mechanics there. The crypto extension provides wrapped-key functionality by providing the implementation for it that can be used across all providers. How would we make this only for the KMSClientKeyProvider without adding it to the key provider interface? Would it be up to the consumer of the keyprovider to know whether to wrap it with the extension and to call the getDelegationToken method or not? What happens if you wrap another provider type with the extension and call it?

        This is the sort of "what other options do we have" discussion that I was hoping to have here.

        Alejandro Abdelnur added a comment -

        Let's assume you have a DelegationTokenKeyProviderExtension providing a DelegationTokenExtension interface; it would be something like this:

        public class DelegationTokenKeyProviderExtension
            extends KeyProviderExtension<DelegationTokenExtension> {
        
          public interface DelegationTokenExtension extends Extension {
            public Token<?> getDelegationToken(String renewer) throws IOException;
          }
        
          private DelegationTokenKeyProviderExtension(KeyProvider kp, DelegationTokenExtension dte) {
            super(kp, dte);
          }
        
          public Token<?> getDelegationToken(String renewer) throws IOException {
            Token<?> token = null;
            if (getExtension() != null) {
              token = getExtension().getDelegationToken(renewer);
            }
            return token;
          }
        
          private static class DefaultDelegationTokenExtension implements DelegationTokenExtension {
            public Token<?> getDelegationToken(String renewer) throws IOException {
              return null;
            }
          }
        
          public static DelegationTokenKeyProviderExtension getExtension(KeyProvider kp) {
            DelegationTokenExtension dte =
                (kp instanceof DelegationTokenExtension) ? (DelegationTokenExtension) kp : null;
            return new DelegationTokenKeyProviderExtension(kp, dte);
          }
        }
        

        Using the DelegationTokenKeyProviderExtension to get tokens gives you the same semantics you would get from a getDelegationToken() method baked into the KeyProvider API, but without having the token retrieval in the KeyProvider API itself, which is your source of concern.
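        Stripped of the actual Hadoop types, the decorator-style wiring above can be sketched with simplified stand-ins (all names below are illustrative, not the real Hadoop classes, and the token is a plain String rather than Token<?>):

```java
// Minimal stand-ins for the real Hadoop types, just to show the wiring.
interface Extension {}

class KeyProvider {} // base API: knows nothing about tokens

class KeyProviderExt<E extends Extension> {
    private final KeyProvider kp;
    private final E ext;
    KeyProviderExt(KeyProvider kp, E ext) { this.kp = kp; this.ext = ext; }
    E getExtension() { return ext; }
}

interface DelegationTokenExtension extends Extension {
    String getDelegationToken(String renewer);
}

// A provider that supports tokens implements the extension interface itself.
class KmsLikeProvider extends KeyProvider implements DelegationTokenExtension {
    public String getDelegationToken(String renewer) { return "dt-for-" + renewer; }
}

class DTProviderExtension extends KeyProviderExt<DelegationTokenExtension> {
    private DTProviderExtension(KeyProvider kp, DelegationTokenExtension dte) {
        super(kp, dte);
    }

    public String getDelegationToken(String renewer) {
        // Providers that don't implement the extension yield null, matching
        // the FileSystem.getDelegationToken() default behavior.
        return getExtension() != null ? getExtension().getDelegationToken(renewer) : null;
    }

    static DTProviderExtension wrap(KeyProvider kp) {
        DelegationTokenExtension dte =
            (kp instanceof DelegationTokenExtension) ? (DelegationTokenExtension) kp : null;
        return new DTProviderExtension(kp, dte);
    }
}
```

Calling code always goes through the wrapper, so the base KeyProvider interface stays token-free while token-capable providers still surface their tokens.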

        Larry McCay added a comment -

        That seems pretty convoluted.

        Let's step back a second - so that the full usecase is clear.

        • consumers of the managed keys will need access to them from services/tasks at execution time
        • some of the keys will be unknown until file access time
        • so, at job submission time KMS delegation tokens are needed so that the services/tasks can access the required keys as the submitting user later as they discover the need for the specific keys from HDFS ext attrs
        • therefore the delegation tokens have to be in the credentials file
        • they will also need to be made available to the KMSClientKeyProvider to include in the request to the KMS

        So, we need:

         1. the ability to get the KMS delegation token at job submission time
         2. the ability to add it to and get it from the credentials file (already available in Credentials)
            • though it seems that this has to be done by the consuming code, not the KMSClientKeyProvider code
         3. the ability to supply the delegation token to the KMSClientKeyProvider when requesting keys

        My questions:

        A. For #1 can't we have a standalone DelegationTokenClient component - especially since there is another jira for refactoring delegation token support out into common to be more reusable? Such a client could then potentially be used inside the KMSClientKeyProvider.
        B. Wouldn't it be better if providers that know they need delegation tokens were able to handle #2 themselves?
        C. How is #3 above going to be handled using the current interfaces - I don't see how it is being added to the interaction currently?
        D. If the KMSClientKeyProvider had access to the credentials object (already has access to UserKeyProvider) or some other execution context itself, then could that be a way that #3 could be addressed?

        Aaron T. Myers added a comment -

        That seems pretty convoluted.

        In the future, I would appreciate you explaining why a proposal seems convoluted. It seems quite straightforward to me, so I'm not sure how to address this comment. This proposal is an attempt to compromise and address your concern, which I understood to be not wanting to have this method baked into the KeyProvider interface, thus allowing some implementations to use it and others not to.

        A. For #1 can't we have a standalone DelegationTokenClient component - especially since there is another jira for refactoring delegation token support out into common to be more reusable? Such a client could then potentially be used inside the KMSClientKeyProvider.

        That JIRA seems to me to be orthogonal to this one, so I don't think we should couple the two. How the KmsClientKeyProvider gets tokens under the hood shouldn't have anything to do with the API. Also, as you point out later in question C, it will still be necessary for the submitting code to somehow call/interact with the tokens/credentials of the KeyProvider at submission time, so I don't think it's actually possible to entirely encapsulate the delegation token fetching/storage within the KeyProvider implementation.

        B. Wouldn't it be better if providers that know they need delegation tokens were able to handle #2 themselves?

        How about changing the proposal to mimic what's done in FileSystem today and add a method like "public Token<?>[] addDelegationTokens(final String renewer, Credentials credentials)" to the KeyProvider API? The default behavior would be to add no tokens to the provided Credentials object, but the KmsClientKeyProvider could instead fetch and stash away the tokens in the provided Credentials object.
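        A sketch of that default-plus-override shape, again with simplified stand-ins for Credentials and with a plain String standing in for Token<?> (class names here are illustrative, not the real Hadoop ones):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for Hadoop's Credentials object.
class Credentials {
    final Map<String, String> tokens = new HashMap<>();
    void addToken(String alias, String token) { tokens.put(alias, token); }
}

class KeyProviderBase {
    // Default: contribute no tokens and return an empty array, mirroring
    // the FileSystem contract where most implementations have none to add.
    public String[] addDelegationTokens(String renewer, Credentials credentials) {
        return new String[0];
    }
}

class KmsClientProviderSketch extends KeyProviderBase {
    @Override
    public String[] addDelegationTokens(String renewer, Credentials credentials) {
        // A real implementation would fetch the token from the KMS over
        // HTTP; fabricated here to show the stash-and-return shape.
        String token = "kms-dt-for-" + renewer;
        credentials.addToken("kms-dt", token);
        return new String[] { token };
    }
}
```

On the array question raised later in the thread: FileSystem's addDelegationTokens returns an array because one logical filesystem can be backed by several services each needing its own token, and presumably the same could hold for a provider fronting multiple key services.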

        C. How is #3 above going to be handled using the current interfaces - I don't see how it is being added to the interaction currently?

        I believe this will happen transparently, because the tokens contained in the Credentials object will be added to the UGI object which will then be used to authenticate all the RPCs. The KeyProvider shouldn't need access to the tokens in the tasks.

        D. If the KMSClientKeyProvider had access to the credentials object (already has access to UserKeyProvider) or some other execution context itself, then could that be a way that #3 could be addressed?

        If I'm understanding you correctly, I think this is basically the same as what I'm proposing above in response to your question B. Am I right about that? Will this work for you?

        Larry McCay added a comment -

        Hey Aaron T. Myers - Sorry for the "convoluted" statement - that came across stronger than I intended. I just really don't like the instanceof check in there to return an instance of an extension or null. I do understand the motivation but think that something simpler would be better.

        Your proposal is actually very similar to my context proposal but limited to delegation tokens. In general, I am in favor of this approach. Do you think that we could make it more generic though? Out of curiosity, why does it return an array of Tokens?

        If we were to open it up to include other things, like keys or passwords, etc then we could just make it an add credentials method call:
        HashMap<String,Object> addToCredentials(HashMap<String,Object> props, Credentials creds)
        or
        HashMap<String,Object> setupCredentials(HashMap<String,Object> props, Credentials creds)

        Where renewer would be in the props when a given provider expects it.
        But we could also include the keyversions of the keys we know about at submission time and they be added.
        We could provide the names of passwords that may be needed by a given provider as well.

        Still not sure how the returned tokens are used in your proposal but they could be returned in the hashmap in this proposal - as well as anything else that would make sense.

        We would just need a couple well-known property names to represent:

        • renewer
        • keyversions
        • passwords
        • returned tokens?

        Does that make any sense?
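The map-based call Larry is describing can be sketched roughly as follows. This is an illustrative, self-contained sketch, not an API that exists in Hadoop: `CredentialSetup`, its constant names, and the `Credentials` stand-in are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for Hadoop's Credentials class (hypothetical, for illustration only).
class Credentials { }

interface CredentialSetup {
    // Well-known property names the caller may supply (illustrative).
    String RENEWER = "renewer";
    String KEY_VERSIONS = "keyversions";
    String PASSWORDS = "passwords";

    // Providers read what they need from props, populate creds, and may
    // return results (e.g. fetched tokens) keyed by well-known names.
    Map<String, Object> setupCredentials(Map<String, Object> props, Credentials creds);
}

class GenericSketch {
    public static void main(String[] args) {
        // A provider that needs nothing from the caller returns an empty map.
        CredentialSetup noOp = (props, creds) -> new HashMap<>();
        Map<String, Object> props = new HashMap<>();
        props.put(CredentialSetup.RENEWER, "yarn");
        System.out.println(noOp.setupCredentials(props, new Credentials()).isEmpty()); // prints true
    }
}
```

The weakness Aaron points at below is visible even in the sketch: both sides of the call are `Map<String, Object>`, so nothing is compile-time checked.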

        Larry McCay added a comment -

        Of course, both of those proposals seem very strange to be part of the UserProvider which actually sits on top of the credentials object. :/

        Aaron T. Myers added a comment -

        Do you think that we could make it more generic though?

        I'm sure we could, but I suggest we cross that bridge when we come to it. Hadoop currently does delegated authentication via DelegationTokens everywhere, so let's do something to support that and move on. If in the future we have need for other stuff, we'll amend the API appropriately. Seems quite premature to me to attempt to design a generic API when we don't have any concrete alternate use-cases.

        Out of curiosity, why does it return an array of Tokens?

        The various callers use it for different things, e.g. in some places just to log which tokens were renewed. I don't think it's actually integral to the functioning of the API, just a convenience.

        If we were to open it up to include other things, like keys or passwords, etc then we could just make it an add credentials method call:<snip>

        In general I'm really leery of a HashMap<String,Object>-based API. That seems quite fragile to me, and very overly-generic for the common use case of just dealing with DTs.

        How about as a way forward with this JIRA we go with the "public Token<?>[] addDelegationTokens(final String renewer, Credentials credentials)" added to KeyProvider as I proposed, and revisit a more generic API in the future when we actually have a concrete need for it? We could then perhaps later add a "addAdditionalCredentials" API call or something to accommodate non-DT-based implementations. It is *soft*ware, after all.
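The default-no-op shape being proposed here can be sketched as below. `Token` and `Credentials` are simplified stand-ins for the real Hadoop classes, and the real method also throws `IOException`, which is omitted to keep the sketch compact; the "remote provider" is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for Hadoop's Token and Credentials classes.
class Token {
    final String service;
    Token(String service) { this.service = service; }
}

class Credentials {
    private final List<Token> tokens = new ArrayList<>();
    void addToken(Token t) { tokens.add(t); }
    int numberOfTokens() { return tokens.size(); }
}

// Base provider adds no tokens by default, mirroring FileSystem#addDelegationTokens.
class KeyProvider {
    public Token[] addDelegationTokens(String renewer, Credentials credentials) {
        return new Token[0]; // default: nothing to add to the Credentials
    }
}

// A remote provider (think: a KMS client) fetches a token and stashes it away.
class RemoteKeyProvider extends KeyProvider {
    @Override
    public Token[] addDelegationTokens(String renewer, Credentials credentials) {
        Token t = new Token("kms://example-host:16000"); // hypothetical service address
        credentials.addToken(t);
        return new Token[] { t };
    }
}

class Demo {
    public static void main(String[] args) {
        Credentials creds = new Credentials();
        new KeyProvider().addDelegationTokens("yarn", creds);
        System.out.println(creds.numberOfTokens()); // prints 0
        new RemoteKeyProvider().addDelegationTokens("yarn", creds);
        System.out.println(creds.numberOfTokens()); // prints 1
    }
}
```

Submitters only ever deal with the `Credentials` object; the returned `Token[]` is a convenience for callers that want to log or inspect what was fetched.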

        Larry McCay added a comment -

        Well, it isn't much different than the original getDelegationToken proposal this way.

        So - you don't think that it makes sense to add a method that can move a list of specified keyversions into the credentials object?
        That seems to imply that all keys will be fetched at runtime rather than those we know about at submission time being added then.

        Incidentally, we wouldn't have to make that generic - we could come up with a type safe context that includes the same properties:

        • renewer
        • keyversions
        • passwords (can leave this one out until we need it)
        • returned tokens (if they are needed)

        Anyway, I've beaten this one to death.
        Thanks for accommodating my nit-picking.

        I think that I'll let Owen O'Malley weigh in when he is back.

        Larry McCay added a comment -

        After some more thought on this, I think the way forward may best be the extension pattern.
        This leaves the responsibility of fetching tokens and/or populating the credentials object outside of the key provider, but available for providers that explicitly call it out.
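The extension pattern being settled on can be sketched like this. It is a simplified, self-contained sketch — the class and method names are modeled on the eventual `KeyProviderDelegationTokenExtension`, but `Token`, `Credentials`, and `KmsLikeProvider` are illustrative stand-ins, not the committed code.

```java
// Simplified stand-ins for the real Hadoop classes (illustrative only).
class Token { }
class Credentials { }

class KeyProvider { /* base key-management API elided */ }

interface DelegationTokenExtension {
    Token[] addDelegationTokens(String renewer, Credentials credentials);
}

// Wrapper that exposes the extension regardless of what the wrapped provider supports.
class KeyProviderDelegationTokenExtension implements DelegationTokenExtension {
    // Non-implementing providers get a no-op that adds nothing.
    private static final DelegationTokenExtension NO_OP =
        (renewer, creds) -> new Token[0];

    private final DelegationTokenExtension extension;

    private KeyProviderDelegationTokenExtension(DelegationTokenExtension extension) {
        this.extension = extension;
    }

    // The instanceof check lives in one place, instead of in every caller.
    static KeyProviderDelegationTokenExtension create(KeyProvider provider) {
        DelegationTokenExtension ext =
            (provider instanceof DelegationTokenExtension)
                ? (DelegationTokenExtension) provider
                : NO_OP;
        return new KeyProviderDelegationTokenExtension(ext);
    }

    @Override
    public Token[] addDelegationTokens(String renewer, Credentials credentials) {
        return extension.addDelegationTokens(renewer, credentials);
    }
}

// A provider opts in simply by implementing the extension interface.
class KmsLikeProvider extends KeyProvider implements DelegationTokenExtension {
    @Override
    public Token[] addDelegationTokens(String renewer, Credentials credentials) {
        return new Token[] { new Token() }; // would fetch and stash a real token
    }
}

class ExtensionDemo {
    public static void main(String[] args) {
        System.out.println(KeyProviderDelegationTokenExtension.create(new KeyProvider())
            .addDelegationTokens("yarn", new Credentials()).length); // prints 0
        System.out.println(KeyProviderDelegationTokenExtension.create(new KmsLikeProvider())
            .addDelegationTokens("yarn", new Credentials()).length); // prints 1
    }
}
```

This keeps the base `KeyProvider` API untouched: callers that care about delegation tokens go through `create()`, and providers that don't implement the interface silently contribute nothing.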

        Alejandro Abdelnur added a comment -

        Larry McCay,

        So - you don't think that it makes sense to add a method that can move a list of specified keyversions into the credentials object? That seems to imply that all keys will be fetched at runtime rather than those we know about at submission time being added then.

        After reading your question, I wonder if this was the disconnect in the discussion.

        This patch is not about adding keys themselves to the credentials, but about delegation tokens so that tasks can interact with the KeyProvider without a Kerberos session.

        Your self-answer is right on the use case we are interested in here.

        On making things generic, I see the merit of that, though I don't think it is in the scope of this JIRA.

        Sure, we'll go with the extension pattern then.

        Arun Suresh, I think the right method to have in the extension, as Aaron T. Myers pointed out, is addDelegationTokens(), with a signature similar to the FileSystem API.

        Alejandro Abdelnur made changes -
        Summary: Add getDelegationToken() method to KeyProvider → Create KeyProvider extension to handle delegation tokens
        Arun Suresh added a comment -

        Added patch with KeyProviderDelegationTokenExtension and a no-op implementation of DelegationTokenExtension

        Arun Suresh made changes -
        Attachment: HADOOP-10769.1.patch
        Arun Suresh made changes -
        Status: Open → Patch Available
        Alejandro Abdelnur added a comment -

        a couple of comments on the patch:

        • the class javadoc should say "A KeyProvider extension"
        • add a simple testcase with a mock KP implementing the interface to verify the create, also a test for a non-implementing KP
        Arun Suresh added a comment -

        Updated patch addressing the feedback.

        Arun Suresh made changes -
        Attachment: HADOOP-10769.2.patch
        Alejandro Abdelnur added a comment -

        +1 pending jenkins

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12654137/HADOOP-10769.1.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4213//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4213//console

        This message is automatically generated.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12654140/HADOOP-10769.2.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        -1 release audit. The applied patch generated 1 release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4214//testReport/
        Release audit warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/4214//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
        Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4214//console

        This message is automatically generated.

        Arun Suresh added a comment -

        Added license header to test-case

        Arun Suresh made changes -
        Attachment: HADOOP-10769.3.patch
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12654143/HADOOP-10769.3.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4215//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4215//console

        This message is automatically generated.

        Aaron T. Myers added a comment -

        +1, the latest patch looks good to me. I'm going to commit this momentarily.

        Aaron T. Myers added a comment -

        I've just committed this to trunk.

        Thanks a lot for the contribution, Arun. Thanks also to Tucu and Larry for the discussion and reviews.

        Aaron T. Myers made changes -
        Status: Patch Available → Resolved
        Hadoop Flags: Reviewed
        Fix Version/s: 3.0.0
        Resolution: Fixed
        Hudson added a comment -

        SUCCESS: Integrated in Hadoop-trunk-Commit #5830 (See https://builds.apache.org/job/Hadoop-trunk-Commit/5830/)
        HADOOP-10769. Create KeyProvider extension to handle delegation tokens. Contributed by Arun Suresh. (atm: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1608286)

        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/KeyProviderDelegationTokenExtension.java
        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/key/TestKeyProviderDelegationTokenExtension.java
        Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Yarn-trunk #606 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/606/)
        HADOOP-10769. Create KeyProvider extension to handle delegation tokens. Contributed by Arun Suresh. (atm: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1608286)

        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/KeyProviderDelegationTokenExtension.java
        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/key/TestKeyProviderDelegationTokenExtension.java
        Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk #1797 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1797/)
        HADOOP-10769. Create KeyProvider extension to handle delegation tokens. Contributed by Arun Suresh. (atm: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1608286)

        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/KeyProviderDelegationTokenExtension.java
        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/key/TestKeyProviderDelegationTokenExtension.java
        Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk #1824 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1824/)
        HADOOP-10769. Create KeyProvider extension to handle delegation tokens. Contributed by Arun Suresh. (atm: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1608286)

        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/KeyProviderDelegationTokenExtension.java
        • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/key/TestKeyProviderDelegationTokenExtension.java
        Alejandro Abdelnur made changes -
        Fix Version/s: 3.0.0 → 2.6.0
        Arun C Murthy made changes -
        Status: Resolved → Closed
        Transition                   Time In Source Status   Executions   Last Executer    Last Execution Date
        Open → Patch Available       3d 1h 3m                1            Arun Suresh      04/Jul/14 21:25
        Patch Available → Resolved   1d 22h 52m              1            Aaron T. Myers   06/Jul/14 20:18
        Resolved → Closed            147d 7h 52m             1            Arun C Murthy    01/Dec/14 03:10

          People

          • Assignee: Arun Suresh
          • Reporter: Alejandro Abdelnur
          • Votes: 0
          • Watchers: 8

          Dates

          • Created:
          • Updated:
          • Resolved:

          Development