IMPALA-3189

Address scalability issue with N^2 KDC requests on cluster startup


Details

    Description

      When Impala runs a query that shuffles data among all nodes in a Kerberos-secured cluster, every node must obtain a service ticket from the KDC's ticket-granting service (TGS) for every other node, so a cluster of N nodes issues on the order of N^2 TGS requests. In a cluster of 100 nodes or more, this can overwhelm the KDC, and queries can fail with an error ("Could not contact KDC for realm").
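To make the N^2 growth concrete, here is a back-of-the-envelope count. It assumes each node needs exactly one service ticket per peer and that nothing is cached yet; the function name is illustrative and not part of Impala.

```python
def tgs_requests_at_startup(num_nodes: int) -> int:
    """Upper bound on TGS requests when every node contacts every other node once.

    Assumption: one service ticket per (source, destination) pair, no cached tickets.
    """
    if num_nodes < 0:
        raise ValueError("num_nodes must be non-negative")
    # Each of the N nodes requests a ticket for each of its N - 1 peers.
    return num_nodes * (num_nodes - 1)
```

Under these assumptions a 100-node cluster sends 9,900 requests to the KDC in a short burst, and doubling the cluster size roughly quadruples that number.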

      A simple workaround is to run a warm-up query until it succeeds (which can take a few minutes after cluster startup). The KDC can also be scaled (e.g. with secondary KDC nodes).
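The warm-up workaround could be scripted roughly as follows. This is a minimal sketch, not an Impala tool: `run_query` is a hypothetical callable that submits the warm-up query through whatever client is in use and raises on failure, and the attempt count and delays are arbitrary defaults.

```python
import time
from typing import Callable


def warm_up(
    run_query: Callable[[], None],
    max_attempts: int = 20,
    initial_delay_s: float = 1.0,
    max_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry a warm-up query until it succeeds; return the number of attempts used."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = initial_delay_s
    for attempt in range(1, max_attempts + 1):
        try:
            run_query()
            return attempt
        except Exception:
            # A real script should catch only the client's own error type
            # (e.g. the one carrying "Could not contact KDC for realm").
            if attempt == max_attempts:
                raise
            sleep(delay)
            # Back off exponentially so the retries do not add to the KDC load.
            delay = min(delay * 2, max_delay_s)
    raise AssertionError("unreachable")
```

Backing off between attempts matters here because each failed attempt can itself trigger another burst of ticket requests.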

      Impala could also either force a TGS request at startup in a staggered fashion, or move to recommending SSL with client certificates for server<->server communication.
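One way the staggering idea might look is sketched below. The issue does not specify a mechanism, so this is only an assumption: each node derives a deterministic offset from its own hostname and waits that long before requesting tickets, which spreads the requests over a window. The function name and the 60-second default are hypothetical.

```python
import hashlib


def startup_delay_s(hostname: str, window_s: float = 60.0) -> float:
    """Deterministic per-node delay in [0, window_s), derived from the hostname."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    # 48 bits divide exactly as a float, so the fraction is strictly below 1.0.
    fraction = int.from_bytes(digest[:6], "big") / 2**48
    return fraction * window_s
```

Hashing the hostname needs no coordination between nodes and gives the same offset on every restart; a random delay would spread the load equally well but would not be reproducible.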

          People

            Assignee: Unassigned
            Reporter: Henry Robinson (henryr)
