Pig / PIG-4796

Authenticate with Kerberos using a keytab file

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.15.0
    • Fix Version/s: 0.16.0
    • Component/s: None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Support for logging in to a Kerberos-secured Hadoop cluster using a keytab file, which allows running jobs that last longer than the maximum Kerberos ticket lifetime.
    • Tags:
      kerberos

      Description

      When running in a Kerberos-secured environment, users face the limitation that their jobs cannot run longer than the remaining lifetime of their Kerberos tickets. In the environment I work in, these tickets expire after 10 hours, which limits the maximum job duration to 10 hours. That is a problem for long-running jobs.

      The Hadoop tooling has a feature that lets you authenticate using a Kerberos keytab file (essentially a file that contains the Kerberos principal and encrypted keys derived from its password). With a keytab, the running application can request new tickets from the Kerberos server when the initial tickets expire.

      In my Java/Hadoop applications I commonly include these two lines:

      System.setProperty("java.security.krb5.conf", "/etc/krb5.conf");
      UserGroupInformation.loginUserFromKeytab("nbasjes@XXXXXX.NET", "/home/nbasjes/.krb/nbasjes.keytab");
      

      This way I have run an Apache Flink based application for more than 170 hours (about a week) on the Kerberos-secured YARN cluster.

      What I propose is a feature that lets me set the relevant Kerberos values in my Pig script, so that a Pig job can run for many days on the secured cluster.

      Proposal for how this could look in a Pig script:

      SET java.security.krb5.conf '/etc/krb5.conf'
      SET job.security.krb5.principal 'nbasjes@XXXXXX.NET'
      SET job.security.krb5.keytab '/home/nbasjes/.krb/nbasjes.keytab'
      

      If all of these are set (or at least the last two), the aforementioned UserGroupInformation.loginUserFromKeytab method is called before the job is submitted to the cluster.
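
      The decision logic described above could be sketched roughly as follows. This is only an illustration, not the code from the attached patches. The class name KeytabLoginSketch, the method loginIfConfigured and the login callback are hypothetical. The callback stands in for UserGroupInformation.loginUserFromKeytab so that the sketch runs without Hadoop on the classpath. Skipping the login when only one of the two required values is set is an assumption; the proposal does not say what should happen in that case.

```java
import java.util.Properties;
import java.util.function.BiConsumer;

// Hypothetical helper for illustration; not Pig's actual implementation.
final class KeytabLoginSketch {
    // Property names as proposed in this issue.
    static final String KRB5_CONF = "java.security.krb5.conf";
    static final String PRINCIPAL = "job.security.krb5.principal";
    static final String KEYTAB = "job.security.krb5.keytab";

    private KeytabLoginSketch() {}

    /**
     * Invokes login.accept(principal, keytab) when both values are set.
     * Returns true if a login was attempted, false otherwise.
     */
    static boolean loginIfConfigured(Properties props, BiConsumer<String, String> login) {
        String principal = props.getProperty(PRINCIPAL);
        String keytab = props.getProperty(KEYTAB);
        if (isBlank(principal) || isBlank(keytab)) {
            // Assumption: without both values, keep using the existing ticket cache.
            return false;
        }
        // The krb5.conf location is optional; only override it when given.
        String krb5Conf = props.getProperty(KRB5_CONF);
        if (!isBlank(krb5Conf)) {
            System.setProperty(KRB5_CONF, krb5Conf);
        }
        login.accept(principal, keytab);
        return true;
    }

    private static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(PRINCIPAL, "nbasjes@XXXXXX.NET");
        props.setProperty(KEYTAB, "/home/nbasjes/.krb/nbasjes.keytab");
        // Stand-in callback: a real caller would delegate to
        // UserGroupInformation.loginUserFromKeytab(principal, keytab).
        boolean attempted = loginIfConfigured(props, (principal, keytab) ->
                System.out.println("would log in " + principal + " using " + keytab));
        System.out.println("login attempted: " + attempted);
    }
}
```

      Passing the login step in as a callback keeps the property check separate from the Hadoop call, so it can be exercised without a Kerberos server.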

        Attachments

        1. PIG-4796-4.patch
          10 kB
          Niels Basjes
        2. PIG-4796-2016-02-23.patch
          10 kB
          Niels Basjes
        3. 2016-02-18-PIG-4796-rough-proof-of-concept.patch
          7 kB
          Niels Basjes
        4. 2016-02-18-1510-PIG-4796.patch
          4 kB
          Niels Basjes

            People

            • Assignee:
              nielsbasjes Niels Basjes
              Reporter:
              nielsbasjes Niels Basjes