  1. Hadoop Common
  2. HADOOP-9533

Centralized Hadoop SSO/Token Server



    • New Feature
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • security
    • None
    • security


      This is an umbrella Jira filing to oversee a set of proposals for introducing a new master service for Hadoop Single Sign On (HSSO).

      There is an increasing need for pluggable authentication providers that authenticate both users and services, and that validate tokens in order to federate identities authenticated by trusted IDPs. These IDPs may be deployed within the enterprise, or they may be third-party IDPs external to it.

      These needs speak to a specific pain point: the narrow integration path into the enterprise identity infrastructure. Kerberos is a fine solution for those who already have it in place or are willing to adopt it, but there remains a class of users who find this unacceptable and need to integrate with a wider variety of identity management solutions.

      Another specific pain point is rolling and distributing keys. A related and integral part of the HSSO server is a library called the Credential Management Framework (CMF), which will be a common library for easing the management of secrets, keys and credentials.

      Initially, the existing delegation, block access and job tokens will continue to be used. Some changes may be required so that they can leverage a PKI-based signature facility rather than shared secrets, which would simplify the problem of distributing those secrets.

      This project will primarily centralize the responsibility of authentication and federation into a single service that is trusted across the Hadoop cluster and optionally across multiple clusters. This greatly simplifies a number of things in the Hadoop ecosystem:

      1. a single token format that is used across all of Hadoop regardless of authentication method
      2. a single service that hosts the pluggable providers, instead of every service hosting its own
      3. a single token authority that is trusted across the cluster(s) and that can, through PKI-based signing, easily issue cryptographically verifiable tokens
      4. automatic rolling of the token authority’s keys and publishing of the public key for easy access by those parties that need to verify incoming tokens
      5. use of PKI for signatures eliminates the need for securely sharing and distributing shared secrets
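
      Items 3 to 5 can be illustrated with a minimal sketch, using only the JDK's java.security classes. Everything in it is an assumption made for illustration: the class name TokenSigningSketch, its methods, the integer key ids and the "keyId.payload.signature" token layout are hypothetical and are not taken from the HSSO design or from any Hadoop API. It shows an authority signing with its current private key, rolling to a new key pair, and verifiers checking tokens against published public keys only.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from Hadoop or HSSO.
class TokenSigningSketch {
    // Key pairs by id; only the public halves are ever handed out.
    private final Map<Integer, KeyPair> keys = new LinkedHashMap<>();
    private int currentKeyId = 0;

    TokenSigningSketch() throws GeneralSecurityException {
        rollKey();
    }

    // Generate a fresh signing key. Older public keys stay available so that
    // tokens issued before the roll can still be verified.
    final void rollKey() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        currentKeyId++;
        keys.put(currentKeyId, generator.generateKeyPair());
    }

    // What the authority would publish for verifiers (null if unknown id).
    PublicKey publishedKey(int keyId) {
        KeyPair pair = keys.get(keyId);
        return pair == null ? null : pair.getPublic();
    }

    // Assumed token layout: keyId "." base64url(payload) "." base64url(signature)
    String issue(String payload) throws GeneralSecurityException {
        byte[] body = payload.getBytes(StandardCharsets.UTF_8);
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(keys.get(currentKeyId).getPrivate());
        signer.update(body);
        Base64.Encoder encoder = Base64.getUrlEncoder().withoutPadding();
        return currentKeyId + "." + encoder.encodeToString(body)
                + "." + encoder.encodeToString(signer.sign());
    }

    // The key id tells a verifier which published public key to fetch.
    static int keyIdOf(String token) {
        return Integer.parseInt(token.substring(0, token.indexOf('.')));
    }

    // Verification needs only a public key, never a shared secret.
    static boolean verify(String token, PublicKey key) throws GeneralSecurityException {
        String[] parts = token.split("\\.");
        if (parts.length != 3 || key == null) {
            return false;
        }
        Base64.Decoder decoder = Base64.getUrlDecoder();
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(key);
        verifier.update(decoder.decode(parts[1]));
        return verifier.verify(decoder.decode(parts[2]));
    }

    public static void main(String[] args) throws GeneralSecurityException {
        TokenSigningSketch authority = new TokenSigningSketch();
        String token = authority.issue("user=alice");
        authority.rollKey(); // tokens issued earlier remain verifiable
        PublicKey key = authority.publishedKey(keyIdOf(token));
        System.out.println("verified: " + verify(token, key));
    }
}
```

      In this sketch a verifying service holds no secret at all, which is the property item 5 relies on. A real deployment would also need expiry, audience and revocation checks, which are omitted here.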

      In addition to serving as the internal Hadoop SSO service, this service will be leveraged by the Knox Gateway at the cluster perimeter in order to acquire Hadoop cluster tokens. The same token mechanism that is used for internal services will be used to represent user identities. This enables interesting scenarios such as SSO across Hadoop clusters within an enterprise and/or into the cloud.

      The HSSO service will consist of three major components and capabilities:

      1. Federating IDP – authenticates users/services and issues the common Hadoop token
      2. Federating SP – validates the token of trusted external IDPs and issues the common Hadoop token
      3. Token Authority – management of the common Hadoop tokens – including:
      a. Issuance
      b. Renewal
      c. Revocation
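
      The three Token Authority operations listed above can be sketched as a small in-memory lifecycle. This is only an illustration under stated assumptions: the class TokenAuthoritySketch, its method names, the random UUID token ids and the fixed-lifetime renewal policy are all hypothetical and do not describe the actual HSSO interfaces, which are left to the child Jira filings.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch: not an HSSO or Hadoop API.
class TokenAuthoritySketch {
    // Token id -> expiry time. A revoked token is simply removed.
    private final Map<String, Instant> expiryById = new HashMap<>();
    private final Duration lifetime;

    TokenAuthoritySketch(Duration lifetime) {
        this.lifetime = lifetime;
    }

    // Issuance: create a token id that is valid for one lifetime from 'now'.
    synchronized String issue(Instant now) {
        String id = UUID.randomUUID().toString();
        expiryById.put(id, now.plus(lifetime));
        return id;
    }

    // Renewal: extend an unexpired, unrevoked token by one lifetime from 'now'.
    synchronized boolean renew(String id, Instant now) {
        if (!isValid(id, now)) {
            return false;
        }
        expiryById.put(id, now.plus(lifetime));
        return true;
    }

    // Revocation: forget the token. Returns false if it was not known.
    synchronized boolean revoke(String id) {
        return expiryById.remove(id) != null;
    }

    // A token is valid while it is known and 'now' is before its expiry.
    synchronized boolean isValid(String id, Instant now) {
        Instant expiry = expiryById.get(id);
        return expiry != null && now.isBefore(expiry);
    }

    public static void main(String[] args) {
        TokenAuthoritySketch authority = new TokenAuthoritySketch(Duration.ofMinutes(10));
        Instant start = Instant.now();
        String id = authority.issue(start);
        System.out.println("renewed: " + authority.renew(id, start.plusSeconds(60)));
        System.out.println("revoked: " + authority.revoke(id));
        System.out.println("valid after revoke: " + authority.isValid(id, start.plusSeconds(61)));
    }
}
```

      Passing the current time in explicitly is a design choice made for this sketch so that the expiry behaviour can be exercised deterministically.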

      As this is a meta Jira for tracking this overall effort, the details of the individual efforts will be submitted along with the child Jira filings.

      Hadoop-Common would seem to be the most appropriate home for such a service and its related common facilities. We will also leverage and extend existing common mechanisms as appropriate.


        1. HSSO-Interaction-Overview-rev-1.docx
          136 kB
          Larry McCay
        2. HSSO-Interaction-Overview-rev-1.pdf
          341 kB
          Larry McCay


            lmccay Larry McCay



              Time Tracking

                Original Estimate - 1,176h
                Remaining Estimate - 1,176h
                Time Spent - Not Specified
