Uploaded image for project: 'Hadoop Map/Reduce'
  1. Hadoop Map/Reduce
  2. MAPREDUCE-7273

JHS: make sure that Kerberos relogin is performed when KDC becomes offline then online again

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.10.0, 3.2.1, 3.1.3
    • Fix Version/s: 3.4.0
    • Component/s: jobhistoryserver
    • Labels:
      None
    • Target Version/s:

      Description

      In JHS, if the KDC goes offline, the IPC layer does try to relogin, but it's not always enough. You have to wait for 60 seconds for the next retry. In the meantime, if the KDC comes back, the following error might occur:

      2020-04-09 03:27:52,075 DEBUG ipc.Server (Server.java:processSaslToken(1952)) - Have read input token of size 708 for processing by saslServer.evaluateResponse()
      2020-04-09 03:27:52,077 DEBUG ipc.Server (Server.java:saslProcess(1829)) - javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Invalid argument (400) - Cannot find key of appropriate type to decrypt AP REP - AES128 CTS mode with HMAC SHA1-96)]
              at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:199)
      ...
      

      When this happens, JHS has to be restarted.

        Attachments

        1. MAPREDUCE-7273-001.patch
          2 kB
          Peter Bacsko
        2. MAPREDUCE-7273-002.patch
          2 kB
          Peter Bacsko

          Activity

            People

            • Assignee:
              pbacsko Peter Bacsko
              Reporter:
              pbacsko Peter Bacsko
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: