Hadoop Common / HADOOP-13604

Abort retry loop when RPC has an unrecoverable Auth error

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      I've seen an issue where, after an RPC client hit an error obtaining a TGT from Kerberos, the client kept retrying even though there was no chance of success (the window during which no re-login is attempted is set to 600s).

      In this particular deployment, the client retries 15 times at 15s intervals, so the caller waits more than three minutes (15 × 15s = 225s of sleeping alone) before the RPC ultimately fails and the error is reported.

      Unrecoverable errors (like failures to log in to Kerberos) should abort the retry loop immediately.
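
      The requested behaviour could look roughly like the sketch below. It is not Hadoop's RPC client code: `FailFastRetry`, `UnrecoverableAuthException`, and `callWithRetries` are hypothetical names made up for illustration, and the 15 attempts at 15s intervals simply mirror the deployment described above. The idea is only that an error classified as unrecoverable is rethrown at once instead of being retried.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

public class FailFastRetry {

    /** Hypothetical marker for errors that retrying cannot fix, e.g. a failed Kerberos login. */
    public static class UnrecoverableAuthException extends IOException {
        public UnrecoverableAuthException(String message) {
            super(message);
        }
    }

    /**
     * Runs the call up to maxAttempts times, sleeping between failed attempts.
     * An UnrecoverableAuthException is rethrown immediately without retrying.
     */
    public static <T> T callWithRetries(Callable<T> call, int maxAttempts, long sleepMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (UnrecoverableAuthException e) {
                // No chance of success on a later attempt: abort the loop now.
                throw e;
            } catch (Exception e) {
                // Assumed transient: remember it and retry after a pause.
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMillis);
                }
            }
        }
        // All attempts failed with (assumed) transient errors.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger attempts = new AtomicInteger();
        try {
            callWithRetries(() -> {
                attempts.incrementAndGet();
                throw new UnrecoverableAuthException("could not obtain TGT");
            }, 15, 15_000L);
        } catch (UnrecoverableAuthException e) {
            System.out.println("aborted after " + attempts.get() + " attempt(s)");
        }
    }
}
```

      With this shape the simulated login failure is reported after a single attempt rather than after all 15. How a real client would decide that an error is unrecoverable is left open here.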


              People

              • Assignee: Xiao Chen (xiaochen)
              • Reporter: Henry Robinson (henryr)
              • Votes: 1
              • Watchers: 9
